132 research outputs found

    Complexity of Lexical Descriptions and its Relevance to Partial Parsing

    Get PDF
    In this dissertation, we have proposed novel methods for robust parsing that integrate the flexibility of linguistically motivated lexical descriptions with the robustness of statistical techniques. Our thesis is that the computation of linguistic structure can be localized if lexical items are associated with rich descriptions (supertags) that impose complex constraints in a local context. However, increasing the complexity of descriptions makes the number of different descriptions for each lexical item much larger and hence increases the local ambiguity for a parser. This local ambiguity can be resolved by using supertag co-occurrence statistics collected from parsed corpora. We have explored these ideas in the context of Lexicalized Tree-Adjoining Grammar (LTAG) framework wherein supertag disambiguation provides a representation that is an almost parse. We have used the disambiguated supertag sequence in conjunction with a lightweight dependency analyzer to compute noun groups, verb groups, dependency linkages and even partial parses. We have shown that a trigram-based supertagger achieves an accuracy of 92.1‰ on Wall Street Journal (WSJ) texts. Furthermore, we have shown that the lightweight dependency analysis on the output of the supertagger identifies 83‰ of the dependency links accurately. We have exploited the representation of supertags with Explanation-Based Learning to improve parsing effciency. In this approach, parsing in limited domains can be modeled as a Finite-State Transduction. We have implemented such a system for the ATIS domain which improves parsing eciency by a factor of 15. We have used the supertagger in a variety of applications to provide lexical descriptions at an appropriate granularity. In an information retrieval application, we show that the supertag based system performs at higher levels of precision compared to a system based on part-of-speech tags. In an information extraction task, supertags are used in specifying extraction patterns. For language modeling applications, we view supertags as syntactically motivated class labels in a class-based language model. The distinction between recursive and non-recursive supertags is exploited in a sentence simplification application

    Pi-Calculus: A Unifying Framework for Programming Paradigms

    Get PDF
    π-calculus is a calculus for modeling dynamically changing configurations of a network of communicating agents. This paper studies the suitability of π-calculus as a unifying framework to model the operational semantics of the three paradigms of programming: functional, logic and imperative paradigms. In doing so, the attempt is to demonstrate that π-calculus models a primitive that is pervasive in the three paradigms and to illustrate that the three forms of sequential computing are special instances of concurrent computing

    Bootstrapping Multilingual Intent Models via Machine Translation for Dialog Automation

    Get PDF
    With the resurgence of chat-based dialog systems in consumer and enterprise applications, there has been much success in developing data-driven and rule-based natural language models to understand human intent. Since these models require large amounts of data and in-domain knowledge, expanding an equivalent service into new markets is disrupted by language barriers that inhibit dialog automation. This paper presents a user study to evaluate the utility of out-of-the-box machine translation technology to (1) rapidly bootstrap multilingual spoken dialog systems and (2) enable existing human analysts to understand foreign language utterances. We additionally evaluate the utility of machine translation in human assisted environments, where a portion of the traffic is processed by analysts. In English->Spanish experiments, we observe a high potential for dialog automation, as well as the potential for human analysts to process foreign language utterances with high accuracy.Comment: 6 pages, 3 figures, accepted for publication at the 2018 European Association for Machine Translation Conference (EAMT 2018

    Introduction to the special issue on cross-language algorithms and applications

    Get PDF
    With the increasingly global nature of our everyday interactions, the need for multilingual technologies to support efficient and efective information access and communication cannot be overemphasized. Computational modeling of language has been the focus of Natural Language Processing, a subdiscipline of Artificial Intelligence. One of the current challenges for this discipline is to design methodologies and algorithms that are cross-language in order to create multilingual technologies rapidly. The goal of this JAIR special issue on Cross-Language Algorithms and Applications (CLAA) is to present leading research in this area, with emphasis on developing unifying themes that could lead to the development of the science of multi- and cross-lingualism. In this introduction, we provide the reader with the motivation for this special issue and summarize the contributions of the papers that have been included. The selected papers cover a broad range of cross-lingual technologies including machine translation, domain and language adaptation for sentiment analysis, cross-language lexical resources, dependency parsing, information retrieval and knowledge representation. We anticipate that this special issue will serve as an invaluable resource for researchers interested in topics of cross-lingual natural language processing.Postprint (published version
    • …
    corecore